High Performance Memory System for High ILP Microarchitectures

Author

  • Augustus K. Uht
Abstract

A new memory system is proposed that greatly widens the von Neumann bottleneck for uniprocessors with high bandwidth requirements, especially processors exhibiting large degrees of Instruction Level Parallelism (ILP). The new system has high bandwidth and low latency, and is not costly. Further, memory accesses are minimally ordered: only accesses to the same main memory location are sequentialized, while accesses to different addresses are independent. Memory-mapped I/O accesses are also taken into consideration. The system is first described in an overview of its general architecture, and then the details of the communications protocol are given. Lastly, the performance of the system is discussed.
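The ordering rule stated in the abstract (same-address accesses stay in order, different-address accesses are independent) can be illustrated with a minimal Python sketch. This is not the paper's communications protocol, which the abstract does not specify. The names `group_by_address` and `may_reorder` and the sample access stream are hypothetical, and memory-mapped I/O is not modeled.

```python
from collections import defaultdict, deque

def group_by_address(accesses):
    """Split a program-order access stream into per-address queues.

    Hypothetical helper: accesses within one queue must stay in program
    order; queues for different addresses have no ordering between them.
    """
    queues = defaultdict(deque)
    for seq, (op, addr) in enumerate(accesses):
        queues[addr].append((seq, op))
    return dict(queues)

def may_reorder(a, b):
    """Return True if two (op, addr) accesses target different addresses."""
    return a[1] != b[1]

# Illustrative stream only; the addresses are made up.
stream = [("store", 0x10), ("load", 0x20), ("load", 0x10), ("store", 0x20)]
queues = group_by_address(stream)
```

Under these assumptions, the two accesses to `0x10` land in one queue and keep their relative order, while the queue for `0x20` could be serviced concurrently.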

Similar resources

The Construct of Interlanguage Pragmatic Learning Strategies: Investigating Preferences of High vs. Low Pragmatic Performers

Interlanguage pragmatics (ILP) has witnessed a growing body of research in the past two decades. One of the under-explored domains of L2 pragmatics is the role of learning strategies specifically tailored for the development of ILP knowledge. Therefore, this investigation aimed to determine the significant interlanguage pragmatic learning strategies (IPLS) used by high vs. low L2 pragmatic achi...

How to build scalable on-chip ILP networks for a decentralized architecture

The era of billion transistors-on-a-chip is creating a completely different set of design constraints, forcing radically new microprocessor architecture designs. This paper examines a few of the possible microarchitectures that are capable of obtaining scalable ILP performance. First, we observe that the network that interconnects the processing elements is the critical design point in the micr...

Exploiting Instruction-Level Parallelism for Memory System Performance

Current microprocessors improve performance by exploiting instruction-level parallelism (ILP). ILP hardware techniques such as multiple instruction issue, out-of-order (dynamic) issue, and non-blocking reads can accelerate both computation and data memory references. Since computation speeds have been improving faster than data memory access times, memory system performance is quickly becoming ...

Increasing communication performance with a minimal-copy data path supporting ILP and ALF

Many current implementations of communication subsystems on workstation class computers transfer communication data to and from primary memory several times. This is due to software copying between user and operating system address spaces, presentation layer data conversion and other data manipulation functions. The consequence is that memory bandwidth is one of the major performance bottleneck...

Publication date: 1997